Music OCR is the application of optical character recognition to convert sheet music or printed scores into an editable and, often, playable form. Once captured digitally, the music can be saved in commonly used file formats, such as MIDI (for playback) and MusicXML (for page layout).
Early research into the recognition of printed sheet music was carried out at the graduate level in the late 1960s at MIT and other institutions.[1] Successive efforts were made to locate and remove the musical staff lines, leaving the symbols to be recognized and parsed. The first commercial music-scanning product, MIDISCAN, was released in 1991 by Musitek Corporation.
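The staff-line step mentioned above can be illustrated with a minimal sketch. It assumes a binarized page stored as a list of rows, with 1 for ink and 0 for background, and uses a simple horizontal projection: any row that is mostly ink is treated as part of a staff line. The function names and the 0.6 threshold are hypothetical choices for illustration, not the method of any particular product or research system, and real scans would also need to handle skew and curvature.

```python
def find_staff_rows(image, threshold=0.6):
    """Return indices of rows whose fraction of ink pixels is at least threshold.

    `image` is a list of equal-length rows of 0/1 values (1 = ink).
    The threshold is an illustrative assumption, not a standard value.
    """
    rows = []
    for y, row in enumerate(image):
        if row and sum(row) / len(row) >= threshold:
            rows.append(y)
    return rows


def remove_staff_lines(image, staff_rows):
    """Return a copy of `image` with staff-line pixels erased.

    A staff-line pixel is kept when there is ink directly above or below
    the line in the same column, so that stems and note heads crossing
    the line are not cut in two.
    """
    staff = set(staff_rows)
    height = len(image)
    result = [list(row) for row in image]
    for y in staff_rows:
        # Find the nearest rows above and below that are not staff rows,
        # so that lines thicker than one pixel are handled as a unit.
        above = y - 1
        while above in staff:
            above -= 1
        below = y + 1
        while below in staff:
            below += 1
        for x in range(len(image[y])):
            ink_above = above >= 0 and image[above][x] == 1
            ink_below = below < height and image[below][x] == 1
            if not (ink_above or ink_below):
                result[y][x] = 0
    return result
```

On a small test page containing one horizontal line crossed by a vertical stem, this erases the line but keeps the pixel where the stem crosses it. The remaining ink can then be passed to whatever symbol recognizer follows.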
Unlike OCR of text, where words are parsed sequentially, music notation involves parallel elements, as when several voices are present along with unattached performance symbols positioned nearby. The spatial relationship between notes, expression marks, dynamics, articulations and other annotations is therefore an important part of the expression of the music and has to be recovered along with the symbols themselves.
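One way to picture why spatial relationships matter is a sketch that attaches each unattached mark to the note closest to it on the page. The `Symbol` class, the `attach_marks` function and the nearest-centre rule are all hypothetical simplifications introduced here for illustration. An actual recognizer would also weigh the staff, the voice and notation conventions rather than distance alone.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Symbol:
    """A recognized symbol with the page coordinates of its centre."""
    kind: str
    x: float
    y: float


def attach_marks(notes, marks):
    """Map each note index to the kinds of the marks nearest to that note.

    Each mark is assigned to the note whose centre is closest in
    Euclidean distance. This rule is an illustrative assumption.
    """
    if not notes:
        return {}
    attached = {i: [] for i in range(len(notes))}
    for mark in marks:
        nearest = min(
            range(len(notes)),
            key=lambda i: hypot(notes[i].x - mark.x, notes[i].y - mark.y),
        )
        attached[nearest].append(mark.kind)
    return attached
```

With two voices stacked vertically, a staccato dot placed just above an upper-voice note and another placed just below a lower-voice note are assigned to different notes, even though both dots share nearly the same horizontal position. A purely left-to-right reading, as used for text, could not make that distinction.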
Modern music OCR packages achieve accuracy exceeding 99% when a clean scan is used and the notation is not exceptional (e.g. unfilled voices or non-standard symbology).[2] Because music notation uses dots both for staccato marks and to extend the value of a note, artifacts in the scan can lead to interpretation problems.